Everyone has been affected by the COVID-19 pandemic in some manner or the other over the past two and a half years. Oil and gas become a pertinent factor for our study of Alberta because it is one of the greatest contributors to Alberta's GDP.
Alberta produces the majority of Canada's oil, therefore research into the oil and gas business is done to better understand how the province is affected during this pandemic.
Made By: - Swarnim Khosla
Topic: - Impact of COVID-19 on Alberta from Oil Price perspective.
Dataset Responsibility: - Gross Revenue, Net Revenue and Royalty paid to the Albertan government by Alberta's Oil and Gas companies. International prices of Oil price per barrel comparison with Gross Revenues of Alberta's Oil and Gas companies.
Note: - Visualisations at the end.
Skills demonstarted in the project: - Data Cleansing in Python (Pandas, Plotly, Numpy)
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
# Oil Price of past 10 years.
df_oil_price = pd.read_csv("wti-crude-oil-prices-10-year-daily-chart.csv", on_bad_lines='skip') # Oil Price of past 10 years.
display(df_oil_price)
# display(type(df_oil_price))
| date | value | |
|---|---|---|
| 0 | 2012-10-11 | 92.070 |
| 1 | 2012-10-12 | 91.860 |
| 2 | 2012-10-15 | 91.850 |
| 3 | 2012-10-16 | 92.090 |
| 4 | 2012-10-17 | 92.214 |
| ... | ... | ... |
| 2542 | 2022-10-27 | NaN |
| 2543 | 2022-10-28 | NaN |
| 2544 | 2022-10-29 | NaN |
| 2545 | 2022-10-30 | NaN |
| 2546 | 2022-10-31 | NaN |
2547 rows × 2 columns
# Parsing the Date column into Date-time format type.
#Citation: - https://stackoverflow.com/a/39206377/10912105
df_oil_price['date'] = pd.to_datetime(df_oil_price['date'])
# Only extracting the year from the date column.
df_oil_price['date'] = pd.to_datetime(df_oil_price['date']).dt.year
display(df_oil_price)
| date | value | |
|---|---|---|
| 0 | 2012 | 92.070 |
| 1 | 2012 | 91.860 |
| 2 | 2012 | 91.850 |
| 3 | 2012 | 92.090 |
| 4 | 2012 | 92.214 |
| ... | ... | ... |
| 2542 | 2022 | NaN |
| 2543 | 2022 | NaN |
| 2544 | 2022 | NaN |
| 2545 | 2022 | NaN |
| 2546 | 2022 | NaN |
2547 rows × 2 columns
# Ordering the DataFrame in descending order according to the "Date" column.
df_oil_price.sort_values(by = 'date', ascending = False, inplace = True)
df_oil_price.reset_index(drop=True, inplace=True)
display(df_oil_price)
| date | value | |
|---|---|---|
| 0 | 2022 | NaN |
| 1 | 2022 | 98.26 |
| 2 | 2022 | 98.54 |
| 3 | 2022 | 102.07 |
| 4 | 2022 | 103.79 |
| ... | ... | ... |
| 2542 | 2012 | 88.07 |
| 2543 | 2012 | 86.49 |
| 2544 | 2012 | 87.18 |
| 2545 | 2012 | 87.74 |
| 2546 | 2012 | 92.07 |
2547 rows × 2 columns
# Renaming column names.
df_oil_price.rename(columns = {'date':'year', ' value':'oil price ($/bbl)'}, inplace = True)
display(df_oil_price)
| year | oil price ($/bbl) | |
|---|---|---|
| 0 | 2022 | NaN |
| 1 | 2022 | 98.26 |
| 2 | 2022 | 98.54 |
| 3 | 2022 | 102.07 |
| 4 | 2022 | 103.79 |
| ... | ... | ... |
| 2542 | 2012 | 88.07 |
| 2543 | 2012 | 86.49 |
| 2544 | 2012 | 87.18 |
| 2545 | 2012 | 87.74 |
| 2546 | 2012 | 92.07 |
2547 rows × 2 columns
# Removing all the rows which has 0 or NaN / NA Oil price.
df_oil_price['oil price ($/bbl)'] = df_oil_price['oil price ($/bbl)'].fillna(0)
df_oil_price = df_oil_price[df_oil_price['oil price ($/bbl)'] != 0]
# Citation: - https://stackoverflow.com/a/46827856/10912105
# Taking the mean of the oil price per barrel according to each year.
aggregation_functions = {'oil price ($/bbl)': 'mean'}
df_oil_price = df_oil_price.groupby(['year'], as_index=False).aggregate(aggregation_functions).reindex(columns=df_oil_price.columns)
df_oil_price.head(10)
| year | oil price ($/bbl) | |
|---|---|---|
| 0 | 2012 | 87.883071 |
| 1 | 2013 | 98.068722 |
| 2 | 2014 | 92.849937 |
| 3 | 2015 | 48.834810 |
| 4 | 2016 | 43.559508 |
| 5 | 2017 | 50.886789 |
| 6 | 2018 | 64.881921 |
| 7 | 2019 | 57.054024 |
| 8 | 2020 | 39.683857 |
| 9 | 2021 | 68.169953 |
# Ordering the DataFrame in descending order according to the "year" column.
df_oil_price.sort_values(by = 'year', ascending = False, inplace = True)
df_oil_price.reset_index(drop=True, inplace=True)
display(df_oil_price)
| year | oil price ($/bbl) | |
|---|---|---|
| 0 | 2022 | 97.974518 |
| 1 | 2021 | 68.169953 |
| 2 | 2020 | 39.683857 |
| 3 | 2019 | 57.054024 |
| 4 | 2018 | 64.881921 |
| 5 | 2017 | 50.886789 |
| 6 | 2016 | 43.559508 |
| 7 | 2015 | 48.834810 |
| 8 | 2014 | 92.849937 |
| 9 | 2013 | 98.068722 |
| 10 | 2012 | 87.883071 |
# Dropping all the values in the DataFrame where year < 2016.
df_oil_price.drop(df_oil_price[(df_oil_price['year'] < 2016)].index, inplace=True)
display(df_oil_price)
# df_oil_price.to_csv(r'/Users/swarnimkhosla/Desktop/df_oil_price.csv', index=False)
| year | oil price ($/bbl) | |
|---|---|---|
| 0 | 2022 | 97.974518 |
| 1 | 2021 | 68.169953 |
| 2 | 2020 | 39.683857 |
| 3 | 2019 | 57.054024 |
| 4 | 2018 | 64.881921 |
| 5 | 2017 | 50.886789 |
| 6 | 2016 | 43.559508 |
In the next section, The following Data Cleaning steps were taken: -
Data Sourcing and Data Cleansing consists of around 80% of the code. The rest 20% of the code is dedicated to plotting the graph.
For plotting the graph, Python's Plotly was used.
A line graph is made above depicting the changes in gross revenue and net revenue of Oil and Gas Industries with respect to time. The amount of Royalty paid by the Oil and Gas industry to the Albertan government is also shown.
The Line graph is divided into 3 time periods: -
1) Pre Pandemic (2016-2019) 2) COVID-19 Pandemic (2019-2020) 3) Post Pandemic (2020-2021)
# Data Scourcing of all Excel Files.
# Oil Royalties data from 2016 - 2021.
df_oil_royalties_2021 = pd.DataFrame(pd.read_excel("energy-2021-project-data-for-open-data-as-20220509-12am.xlsx"))
df_oil_royalties_2020 = pd.DataFrame(pd.read_excel("energy-2020-project-data-for-open-data-as-20220509-12am.xlsx"))
df_oil_royalties_2019 = pd.DataFrame(pd.read_excel("energy-2019-project-data-for-open-data-as-20220509-12am.xlsx"))
df_oil_royalties_2018 = pd.DataFrame(pd.read_excel("energy-2018-project-data-for-open-data-as-20220509-12am.xlsx"))
df_oil_royalties_2017 = pd.DataFrame(pd.read_excel("energy-2017-project-data-for-open-data-as-20220509-12am.xlsx"))
df_oil_royalties_2016 = pd.DataFrame(pd.read_excel("energy-2016-project-data-for-open-data-as-20220509-12am.xlsx"))
# Combining all the Excel files into 1 DataFrame.
df_oil_royalties = pd.concat([df_oil_royalties_2021, df_oil_royalties_2020, df_oil_royalties_2019, df_oil_royalties_2018, df_oil_royalties_2017, df_oil_royalties_2016])
display(df_oil_royalties)
#display(type(df_oil_price))
| Project | Project Name | Operator ID | Operator Name | Reporting Year | Project Revenue ($) | Diluent Cost ($) | Gross Revenue ($) | Cleaned Crude Bitumen at RCP (barrels) | Gross Revenue ($/bbl) | ... | Return Allowance ($) | Other Costs ($) | Other Net Proceeds ($) | Net Revenue ($) | Royalty Type | Royalty Rate (%) | Royalty Payable ($) | Payout Status | Unrecovered Balance/Net Loss at EOP ($) | Footnote | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | CSR005 | Peace River | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 3.362250e+07 | 1.260828e+07 | 2.101422e+07 | 423410.30 | 49.63 | ... | 7.716787e+08 | 1050711.18 | 3.462709e+07 | 0 | GROSS | 5.00000 | 1050711.18 | PRE | 8.500042e+09 | NaN |
| 1 | CSR006 | Wolf Lake Crown Agreement | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 1.111665e+08 | 3.161476e+07 | 7.955178e+07 | 1307696.75 | 60.83 | ... | 9.515779e+08 | 795517.78 | 1.116160e+08 | 0 | GROSS | 1.00000 | 795517.78 | PRE | 1.045282e+10 | NaN |
| 2 | CSR008 | Lindbergh Crown Agreement | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 6.669183e+07 | 1.043573e+07 | 5.625610e+07 | 910303.39 | 61.80 | ... | 0.000000e+00 | 0.00 | 4.936310e+05 | 23424083 | NET | 29.73905 | 6966099.00 | POST | 0.000000e+00 | NaN |
| 3 | CSR009 | Elk Point Crown Agreement | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 2.003346e+07 | 3.758719e+06 | 1.627474e+07 | 261309.55 | 62.28 | ... | 7.341199e+07 | 813736.88 | 1.846447e+05 | 0 | GROSS | 5.00000 | 813736.88 | PRE | 8.057690e+08 | NaN |
| 4 | OSR003 | John Lake | NaN | IPC CANADA LTD. | 2021 | 4.384320e+06 | 0.000000e+00 | 4.384320e+06 | 78512.67 | 55.84 | ... | 1.295152e+06 | 201313.99 | 1.048239e+05 | 0 | GROSS | 4.59168 | 201313.99 | PRE | 7.147117e+07 | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 115 | OSR151 | Beartrap East | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2016 | 1.316094e+06 | 0.000000e+00 | 1.316094e+06 | 46110.27 | 28.54 | ... | 6.517610e+03 | 17455.73 | 0.000000e+00 | 0 | GROSS | 1.32633 | 17455.73 | PRE | 3.995295e+04 | Project achieved payout in reporting year, the... |
| 116 | OSR151 | Beartrap East | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2016 | 7.083270e+05 | 0.000000e+00 | 7.083270e+05 | 18385.59 | 38.53 | ... | 0.000000e+00 | 40009.00 | 0.000000e+00 | 256597 | NET | 25.17060 | 64587.00 | POST | 0.000000e+00 | Project achieved payout in reporting year, the... |
| 117 | OSR153 | WEST ELLS SAGD | NaN | SUNSHINE OILSANDS LTD. | 2016 | 1.550204e+06 | 1.287384e+06 | 2.726728e+05 | 41244.40 | 6.61 | ... | 1.203090e+07 | 4827.38 | 0.000000e+00 | 0 | GROSS | 1.77039 | 4827.38 | PRE | 6.568549e+08 | NaN |
| 118 | OSR155 | Sawn Lake Pilot | NaN | ANDORA ENERGY CORPORATION | 2016 | -5.864862e+04 | 0.000000e+00 | 2.250230e+03 | 38358.10 | 0.06 | ... | 3.021045e+05 | 22.50 | 0.000000e+00 | 0 | GROSS | 0.99990 | 22.50 | PRE | 4.704630e+07 | NaN |
| 119 | OSR156 | Bonnyville | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2016 | 3.123755e+08 | 1.233286e+08 | 1.890470e+08 | 6211522.44 | 30.43 | ... | 0.000000e+00 | 0.00 | 3.114540e+06 | 87142490 | NET | 25.17077 | 21934436.00 | POST | 0.000000e+00 | NaN |
707 rows × 22 columns
list(df_oil_royalties.columns) # Seeing all the column names
['Project', 'Project Name', 'Operator ID', 'Operator Name', 'Reporting Year', 'Project Revenue ($)', 'Diluent Cost ($)', 'Gross Revenue ($)', 'Cleaned Crude Bitumen at RCP (barrels)', 'Gross Revenue ($/bbl)', 'Operating Costs ($)', 'Capital Costs ($)', 'Return Allowance ($)', 'Other Costs ($)', 'Other Net Proceeds ($)', 'Net Revenue ($)', 'Royalty Type', 'Royalty Rate (%)', 'Royalty Payable ($)', 'Payout Status', 'Unrecovered Balance/Net Loss at EOP ($)', 'Footnote']
pd.options.display.max_columns = None # Seeing all the column names
df_oil_royalties.head(10)
| Project | Project Name | Operator ID | Operator Name | Reporting Year | Project Revenue ($) | Diluent Cost ($) | Gross Revenue ($) | Cleaned Crude Bitumen at RCP (barrels) | Gross Revenue ($/bbl) | Operating Costs ($) | Capital Costs ($) | Return Allowance ($) | Other Costs ($) | Other Net Proceeds ($) | Net Revenue ($) | Royalty Type | Royalty Rate (%) | Royalty Payable ($) | Payout Status | Unrecovered Balance/Net Loss at EOP ($) | Footnote | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | CSR005 | Peace River | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 3.362250e+07 | 12608277.70 | 2.101422e+07 | 423410.30 | 49.63 | 4.868818e+07 | 21164307.19 | 7.716787e+08 | 1050711.18 | 3.462709e+07 | 0 | GROSS | 5.00000 | 1050711.18 | PRE | 8.500042e+09 | NaN |
| 1 | CSR006 | Wolf Lake Crown Agreement | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 1.111665e+08 | 31614761.52 | 7.955178e+07 | 1307696.75 | 60.83 | 1.220061e+08 | 34908197.58 | 9.515779e+08 | 795517.78 | 1.116160e+08 | 0 | GROSS | 1.00000 | 795517.78 | PRE | 1.045282e+10 | NaN |
| 2 | CSR008 | Lindbergh Crown Agreement | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 6.669183e+07 | 10435731.00 | 5.625610e+07 | 910303.39 | 61.80 | 2.309453e+07 | 10231118.00 | 0.000000e+00 | 0.00 | 4.936310e+05 | 23424083 | NET | 29.73905 | 6966099.00 | POST | 0.000000e+00 | NaN |
| 3 | CSR009 | Elk Point Crown Agreement | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 2.003346e+07 | 3758718.93 | 1.627474e+07 | 261309.55 | 62.28 | 8.621470e+06 | 4334143.37 | 7.341199e+07 | 813736.88 | 1.846447e+05 | 0 | GROSS | 5.00000 | 813736.88 | PRE | 8.057690e+08 | NaN |
| 4 | OSR003 | John Lake | NaN | IPC CANADA LTD. | 2021 | 4.384320e+06 | 0.00 | 4.384320e+06 | 78512.67 | 55.84 | 2.145084e+06 | 23680.25 | 1.295152e+06 | 201313.99 | 1.048239e+05 | 0 | GROSS | 4.59168 | 201313.99 | PRE | 7.147117e+07 | NaN |
| 5 | OSR004 | Seibert Lake | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 1.304516e+08 | 24355774.00 | 1.060959e+08 | 1715567.49 | 61.84 | 3.216933e+07 | 11587974.00 | 0.000000e+00 | 0.00 | 3.431550e+05 | 62681725 | NET | 31.35769 | 19655541.00 | POST | 0.000000e+00 | NaN |
| 6 | OSR005 | Ft. Kent | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 9.884670e+05 | 186138.00 | 8.023290e+05 | 12917.54 | 62.11 | 3.148350e+05 | 83849.00 | 3.607500e+04 | 3035352.00 | 0.000000e+00 | 0 | GROSS | 4.39072 | 35228.00 | POST | 2.667782e+06 | NaN |
| 7 | OSR006 | Brintnell | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 1.830783e+08 | 26368630.00 | 1.567096e+08 | 2511347.14 | 62.40 | 2.543064e+07 | 14046338.00 | 0.000000e+00 | 0.00 | 1.237063e+06 | 118469719 | NET | 31.35769 | 37149367.00 | POST | 0.000000e+00 | NaN |
| 8 | OSR007 | Frog Lake | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 6.712300e+05 | 127608.00 | 5.436220e+05 | 8665.81 | 62.73 | 5.865690e+05 | 108964.00 | 2.475900e+04 | 2082820.00 | 0.000000e+00 | 0 | GROSS | 4.39073 | 23869.00 | POST | 2.259490e+06 | NaN |
| 9 | OSR008 | Elk Point | NaN | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 0.000000e+00 | 0.00 | 0.000000e+00 | 0.00 | 0.00 | 1.113761e+07 | 1575327.00 | 6.567270e+05 | 55187453.00 | 1.376825e+07 | 0 | NET | 0.00000 | 0.00 | POST | 5.478887e+07 | NaN |
# Taking out only the needed columns.
# Data Cleaning of all Excel Files.
df_oil_royalties = df_oil_royalties.filter(['Operator Name', 'Reporting Year', 'Gross Revenue ($)', 'Gross Revenue ($/bbl)', 'Net Revenue ($)', 'Royalty Payable ($)'])
df_oil_royalties.head(10)
# display(type(df_oil_price))
| Operator Name | Reporting Year | Gross Revenue ($) | Gross Revenue ($/bbl) | Net Revenue ($) | Royalty Payable ($) | |
|---|---|---|---|---|---|---|
| 0 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 2.101422e+07 | 49.63 | 0 | 1050711.18 |
| 1 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 7.955178e+07 | 60.83 | 0 | 795517.78 |
| 2 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 5.625610e+07 | 61.80 | 23424083 | 6966099.00 |
| 3 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 1.627474e+07 | 62.28 | 0 | 813736.88 |
| 4 | IPC CANADA LTD. | 2021 | 4.384320e+06 | 55.84 | 0 | 201313.99 |
| 5 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 1.060959e+08 | 61.84 | 62681725 | 19655541.00 |
| 6 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 8.023290e+05 | 62.11 | 0 | 35228.00 |
| 7 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 1.567096e+08 | 62.40 | 118469719 | 37149367.00 |
| 8 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 5.436220e+05 | 62.73 | 0 | 23869.00 |
| 9 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 0.000000e+00 | 0.00 | 0 | 0.00 |
# Converting values from Scientific notation to Regular Integer notation.
df_oil_royalties[['Gross Revenue ($)', 'Net Revenue ($)', 'Royalty Payable ($)']] = df_oil_royalties[['Gross Revenue ($)', 'Net Revenue ($)', 'Royalty Payable ($)']].astype('int64')
df_oil_royalties.head(10)
# display(type(df_oil_price))
| Operator Name | Reporting Year | Gross Revenue ($) | Gross Revenue ($/bbl) | Net Revenue ($) | Royalty Payable ($) | |
|---|---|---|---|---|---|---|
| 0 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 21014223 | 49.63 | 0 | 1050711 |
| 1 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 79551776 | 60.83 | 0 | 795517 |
| 2 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 56256101 | 61.80 | 23424083 | 6966099 |
| 3 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 16274737 | 62.28 | 0 | 813736 |
| 4 | IPC CANADA LTD. | 2021 | 4384320 | 55.84 | 0 | 201313 |
| 5 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 106095871 | 61.84 | 62681725 | 19655541 |
| 6 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 802329 | 62.11 | 0 | 35228 |
| 7 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 156709635 | 62.40 | 118469719 | 37149367 |
| 8 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 543622 | 62.73 | 0 | 23869 |
| 9 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 0 | 0.00 | 0 | 0 |
# Removing all the rows which has 0 or NaN / NA Gross Revenue.
df_oil_royalties['Gross Revenue ($)'] = df_oil_royalties['Gross Revenue ($)'].fillna(0)
df_oil_royalties = df_oil_royalties[df_oil_royalties['Gross Revenue ($)'] != 0]
df_oil_royalties.head(10)
# display(type(df_oil_price))
| Operator Name | Reporting Year | Gross Revenue ($) | Gross Revenue ($/bbl) | Net Revenue ($) | Royalty Payable ($) | |
|---|---|---|---|---|---|---|
| 0 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 21014223 | 49.63 | 0 | 1050711 |
| 1 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 79551776 | 60.83 | 0 | 795517 |
| 2 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 56256101 | 61.80 | 23424083 | 6966099 |
| 3 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 16274737 | 62.28 | 0 | 813736 |
| 4 | IPC CANADA LTD. | 2021 | 4384320 | 55.84 | 0 | 201313 |
| 5 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 106095871 | 61.84 | 62681725 | 19655541 |
| 6 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 802329 | 62.11 | 0 | 35228 |
| 7 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 156709635 | 62.40 | 118469719 | 37149367 |
| 8 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 543622 | 62.73 | 0 | 23869 |
| 10 | CANADIAN NATURAL RESOURCES LIMITED | 2021 | 9453622 | 61.88 | 3346188 | 1049287 |
# Citation: - https://stackoverflow.com/a/46827856/10912105
# Taking the sum of the Gross Revenue, Net Revenue, and Royalty Payable. Taking the mean of Gross Revenue ($/bbl). Grouping by Year and Operator Name.
aggregation_functions = {'Gross Revenue ($)': 'sum', 'Gross Revenue ($/bbl)': 'mean', 'Net Revenue ($)': 'sum', 'Royalty Payable ($)' : 'sum'}
df_oil_royalties = df_oil_royalties.groupby(['Operator Name','Reporting Year'], as_index=False).aggregate(aggregation_functions).reindex(columns=df_oil_royalties.columns)
df_oil_royalties.head(10)
# display(type(df_oil_price))
| Operator Name | Reporting Year | Gross Revenue ($) | Gross Revenue ($/bbl) | Net Revenue ($) | Royalty Payable ($) | |
|---|---|---|---|---|---|---|
| 0 | ANDORA ENERGY CORPORATION | 2016 | 2250 | 0.060000 | 0 | 22 |
| 1 | ATHABASCA OIL CORPORATION | 2016 | 166527695 | 15.460000 | 0 | 2790687 |
| 2 | ATHABASCA OIL CORPORATION | 2017 | 299572563 | 26.740000 | 0 | 6853128 |
| 3 | ATHABASCA OIL CORPORATION | 2018 | 236621032 | 22.495000 | 0 | 11773851 |
| 4 | ATHABASCA OIL CORPORATION | 2019 | 343442530 | 34.520000 | 0 | 11886024 |
| 5 | ATHABASCA OIL CORPORATION | 2020 | 144603288 | 14.805000 | 0 | 2139280 |
| 6 | ATHABASCA OIL CORPORATION | 2021 | 482041769 | 48.570000 | 0 | 22432576 |
| 7 | BAYTEX ENERGY LTD. | 2016 | 110750464 | 21.583636 | 46892668 | 12102908 |
| 8 | BAYTEX ENERGY LTD. | 2017 | 178415040 | 35.053636 | 85975269 | 24634793 |
| 9 | BAYTEX ENERGY LTD. | 2018 | 158958504 | 32.246667 | 73201178 | 25765232 |
# Ordering the DataFrame in descending order according to column name "Reporting Year".
df_oil_royalties.sort_values(by = 'Reporting Year', ascending = False, inplace = True)
df_oil_royalties.reset_index(drop=True, inplace=True)
df_oil_royalties
# df_oil_royalties.to_csv(r'/Users/swarnimkhosla/Desktop/oil_royalties.csv', index=False)
| Operator Name | Reporting Year | Gross Revenue ($) | Gross Revenue ($/bbl) | Net Revenue ($) | Royalty Payable ($) | |
|---|---|---|---|---|---|---|
| 0 | IPC CANADA LTD. | 2021 | 19177470 | 56.380000 | 0 | 898147 |
| 1 | BIGHORN ENERGY CORPORATION | 2021 | 2686196 | 61.890000 | 762698 | 239164 |
| 2 | CONOCOPHILLIPS CANADA RESOURCES CORP. | 2021 | 2777321976 | 53.460000 | 0 | 127922018 |
| 3 | GEAR ENERGY LTD. | 2021 | 1045139 | 56.890000 | 93499 | 55524 |
| 4 | MEG ENERGY CORP. | 2021 | 1729190390 | 50.220000 | 0 | 80598736 |
| ... | ... | ... | ... | ... | ... | ... |
| 138 | IPC CANADA LTD. | 2016 | 14218887 | 29.005000 | 0 | 218521 |
| 139 | MEG ENERGY CORP. | 2016 | 519720341 | 17.530000 | 0 | 8505512 |
| 140 | OBSIDIAN ENERGY LTD. | 2016 | 47290515 | 19.566667 | 0 | 796932 |
| 141 | OSUM PRODUCTION CORP. | 2016 | 76615234 | 29.080000 | 0 | 1205811 |
| 142 | HUSKY OIL OPERATIONS LIMITED | 2016 | 435622907 | 24.490000 | 30420 | 7136857 |
143 rows × 6 columns
# Taking subset of the main DataFrame df_oil_royalties in order to show Gross Revenue separately.
df_gross_year = df_oil_royalties[['Reporting Year', 'Gross Revenue ($)']]
display(df_gross_year)
| Reporting Year | Gross Revenue ($) | |
|---|---|---|
| 0 | 2021 | 19177470 |
| 1 | 2021 | 2686196 |
| 2 | 2021 | 2777321976 |
| 3 | 2021 | 1045139 |
| 4 | 2021 | 1729190390 |
| ... | ... | ... |
| 138 | 2016 | 14218887 |
| 139 | 2016 | 519720341 |
| 140 | 2016 | 47290515 |
| 141 | 2016 | 76615234 |
| 142 | 2016 | 435622907 |
143 rows × 2 columns
# Summing up the Gross Revenue year wise.
aggregation_functions = {'Gross Revenue ($)': 'sum'}
df_gross_year = df_gross_year.groupby(['Reporting Year'], as_index=False).aggregate(aggregation_functions).reindex(columns = df_gross_year.columns)
display(df_gross_year)
# df_gross_year.to_csv(r'/Users/swarnimkhosla/Desktop/gross_year.csv', index=False)
| Reporting Year | Gross Revenue ($) | |
|---|---|---|
| 0 | 2016 | 22659123750 |
| 1 | 2017 | 37017041140 |
| 2 | 2018 | 37178962075 |
| 3 | 2019 | 51268667790 |
| 4 | 2020 | 24805669463 |
| 5 | 2021 | 63423778139 |
# Taking subset of the main DataFrame df_oil_royalties in order to show Gross Revenue per Barrel of Oil.
df_gross_per_bbp_year = df_oil_royalties[['Reporting Year', 'Gross Revenue ($/bbl)']]
display(df_gross_per_bbp_year)
| Reporting Year | Gross Revenue ($/bbl) | |
|---|---|---|
| 0 | 2021 | 56.380000 |
| 1 | 2021 | 61.890000 |
| 2 | 2021 | 53.460000 |
| 3 | 2021 | 56.890000 |
| 4 | 2021 | 50.220000 |
| ... | ... | ... |
| 138 | 2016 | 29.005000 |
| 139 | 2016 | 17.530000 |
| 140 | 2016 | 19.566667 |
| 141 | 2016 | 29.080000 |
| 142 | 2016 | 24.490000 |
143 rows × 2 columns
# Taking the mean of the Gross Revenue per Barrel, year wise.
aggregation_functions = {'Gross Revenue ($/bbl)': 'mean'}
df_gross_per_bbp_year = df_gross_per_bbp_year.groupby(['Reporting Year'], as_index=False).aggregate(aggregation_functions).reindex(columns = df_gross_per_bbp_year.columns)
display(df_gross_per_bbp_year)
# df_gross_per_bbp_year.to_csv(r'/Users/swarnimkhosla/Desktop/df_gross_per_bbp_year.csv', index=False)
| Reporting Year | Gross Revenue ($/bbl) | |
|---|---|---|
| 0 | 2016 | 20.964915 |
| 1 | 2017 | 33.936117 |
| 2 | 2018 | 30.448446 |
| 3 | 2019 | 41.111914 |
| 4 | 2020 | 20.186641 |
| 5 | 2021 | 50.693179 |
# Taking subset of the main DataFrame df_oil_royalties in order to show Net Revenue separately.
df_net_year = df_oil_royalties[['Reporting Year', 'Net Revenue ($)']]
display(df_net_year)
| Reporting Year | Net Revenue ($) | |
|---|---|---|
| 0 | 2021 | 0 |
| 1 | 2021 | 762698 |
| 2 | 2021 | 0 |
| 3 | 2021 | 93499 |
| 4 | 2021 | 0 |
| ... | ... | ... |
| 138 | 2016 | 0 |
| 139 | 2016 | 0 |
| 140 | 2016 | 0 |
| 141 | 2016 | 0 |
| 142 | 2016 | 30420 |
143 rows × 2 columns
# Summing up the Net Revenue year wise.
aggregation_functions = {'Net Revenue ($)': 'sum'}
df_net_year = df_net_year.groupby(['Reporting Year'], as_index=False).aggregate(aggregation_functions).reindex(columns = df_net_year.columns)
display(df_net_year)
# df_net_year.to_csv(r'/Users/swarnimkhosla/Desktop/df_net_year.csv', index=False)
| Reporting Year | Net Revenue ($) | |
|---|---|---|
| 0 | 2016 | 2482747717 |
| 1 | 2017 | 7174668076 |
| 2 | 2018 | 4934120605 |
| 3 | 2019 | 12990816071 |
| 4 | 2020 | 3511522049 |
| 5 | 2021 | 20201825187 |
# Taking subset of the main DataFrame df_oil_royalties in order to show the royalty paid to the government of Alberta.
df_royalty_year = df_oil_royalties[['Reporting Year', 'Royalty Payable ($)']]
display(df_royalty_year)
| Reporting Year | Royalty Payable ($) | |
|---|---|---|
| 0 | 2021 | 898147 |
| 1 | 2021 | 239164 |
| 2 | 2021 | 127922018 |
| 3 | 2021 | 55524 |
| 4 | 2021 | 80598736 |
| ... | ... | ... |
| 138 | 2016 | 218521 |
| 139 | 2016 | 8505512 |
| 140 | 2016 | 796932 |
| 141 | 2016 | 1205811 |
| 142 | 2016 | 7136857 |
143 rows × 2 columns
# Summing up the Royalty to the government paid year wise.
# Printing the data frame: oil royalties by year
aggregation_functions = {'Royalty Payable ($)': 'sum'}
df_royalty_year = df_royalty_year.groupby(['Reporting Year'], as_index=False).aggregate(aggregation_functions).reindex(columns = df_royalty_year.columns)
display(df_royalty_year)
# df_royalty_year.to_csv(r'/Users/swarnimkhosla/Desktop/df_royalty_year.csv', index=False)
| Reporting Year | Royalty Payable ($) | |
|---|---|---|
| 0 | 2016 | 843040394 |
| 1 | 2017 | 2480251532 |
| 2 | 2018 | 2738473691 |
| 3 | 2019 | 4836819255 |
| 4 | 2020 | 1067122211 |
| 5 | 2021 | 7703784912 |
fig = px.line(df_gross_year, x = "Reporting Year", y = "Gross Revenue ($)", title = "Gross Revenue with respect to Time")
fig.add_vrect(x0=2019, x1=2020,
annotation_text="COVID-19 Pandemic", annotation_position="top",
fillcolor="green", opacity=0.25, line_width=1)
fig.show()
fig = px.line(df_gross_per_bbp_year, x = "Reporting Year", y = "Gross Revenue ($/bbl)", title = "Gross Revenue per Barrel with respect to Time")
fig.add_vrect(x0=2019, x1=2020,
annotation_text="COVID-19 Pandemic", annotation_position="top",
fillcolor="green", opacity=0.25, line_width=1)
fig.show()
fig = px.line(df_net_year, x = "Reporting Year", y = "Net Revenue ($)", title = "Net Revenue with respect to Time")
fig.add_vrect(x0=2019, x1=2020,
annotation_text="COVID-19 Pandemic", annotation_position="top",
fillcolor="green", opacity=0.25, line_width=1)
fig.show()
fig = px.line(df_royalty_year, x = "Reporting Year", y = "Royalty Payable ($)", title = "Royalty paid to the Govt. with respect to Time")
fig.add_vrect(x0=2019, x1=2020,
annotation_text="COVID-19 Pandemic", annotation_position="top",
fillcolor="green", opacity=0.25, line_width=1)
fig.show()
# Showing Gross Revenue, Net Revenue and Royalty payable together.
x = df_gross_year['Reporting Year']
y1 = df_gross_year['Gross Revenue ($)']
y2 = df_net_year['Net Revenue ($)']
y3 = df_royalty_year['Royalty Payable ($)']
# Create traces
fig = go.Figure()
fig.add_trace(go.Scatter(x = x, y = y1,
mode='lines+markers',
name='Gross Revenue'))
fig.add_trace(go.Scatter(x = x, y = y2,
mode='lines+markers',
name='Net Revenue'))
fig.add_trace(go.Scatter(x = x, y = y3,
mode='lines+markers', name='Royalty'))
fig.add_vrect(x0=2019, x1=2020,
annotation_text="COVID-19 Pandemic", annotation_position="top",
fillcolor="green", opacity=0.25, line_width=1)
fig.update_layout(
title = "Gross Revenue, Net Revenue and Royalty with respect to Time",
xaxis_title = "Year",
yaxis_title = "($)",
legend_title="Legend",
font=dict(
family="Courier New, monospace",
size=12,
color="Black"
)
)
fig.show()
# Now showing the relationship between Oil price per Barrel and Gross Revenue per Barrel.
x = df_gross_per_bbp_year['Reporting Year']
y1 = df_oil_price['oil price ($/bbl)']
y2 = df_gross_per_bbp_year['Gross Revenue ($/bbl)']
df_oil_price.sort_values(by = 'year', ascending = True, inplace = True)
# Create traces
fig = go.Figure()
fig.add_trace(go.Scatter(x = x, y = y1,
mode='lines+markers',
name='Oil Price per Barrel'))
fig.add_trace(go.Scatter(x = x, y = y2,
mode='lines+markers',
name='Gross Revenue per Barrel'))
fig.add_vrect(x0=2019, x1=2020,
annotation_text="COVID-19 Pandemic", annotation_position="top",
fillcolor="green", opacity=0.25, line_width=1)
fig.update_layout(
title = "Oil Price and Gross Revenue per Barrel",
xaxis_title = "Year",
yaxis_title = "$/bbl",
legend_title="Legend",
font=dict(
family="Courier New, monospace",
size=12,
color="Black"
)
)
fig.show()
The above Oil Recovery graph was made with the data which was sourced/extracted from of 1 csv and 6 Excel files. The irrelevant data was deleted from the DataFrame and only the useful columns were kept.
As we can observe from the graph, The gross revenue earned by Oil and Gas had steadily been increasing during pre pandemic times.
During the COVID-19 Pandemic, there has been a sharp decrease in Gross Revenue earned by Oil and Gas. This is due to reduce demand for oil.
There is a sharp growth/rebound in Gross Revenue earned by Oil and Gas industry due to resumed demand for oil and gas in the post pandemic phase.
The same can also be observed for Net revenue earned by Oil and Gas, and the Royalty which was paid to the Alberta's government for the following time period.
Key takeaways:
1) COVID-19 pandemic hit on revenues 2) High rebound at 2020-2021